Improving Visualization of High-Dimensional Music Similarity Spaces
Abstract
Visualizations of music databases are a popular form of interface allowing intuitive exploration of music catalogs. They are often based on lower-dimensional projections of high-dimensional music similarity spaces. Such similarity spaces have already been shown to be negatively impacted by so-called hubs and anti-hubs. These are points that appear very close to, or very far from, many other data points because of a problem with measuring distances in high-dimensional spaces. We present an empirical study of how this phenomenon impacts three popular approaches to computing two-dimensional visualizations of music databases. We also show how the negative impact of hubs and anti-hubs can be reduced by re-scaling the high-dimensional spaces before low-dimensional projection.
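To make the notion of hubs and anti-hubs concrete, the sketch below counts how often each point appears among the k nearest neighbors of the other points (often called the k-occurrence). This is an illustration under stated assumptions, not necessarily the measure used in the study: the Euclidean distance, the Gaussian toy data, and the helper names `k_occurrence` and `make_gaussian_data` are all hypothetical choices made here. Points with a very large count behave like hubs; points with a count of zero behave like anti-hubs.

```python
import math
import random


def k_occurrence(points, k):
    """Count how often each point is among the k nearest neighbors of another point."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Distances from point i to every other point (Euclidean is an assumption).
        dists = [(math.dist(points[i], points[j]), j) for j in range(n) if j != i]
        dists.sort()  # ties are broken by the smaller index
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


def make_gaussian_data(n, dim, seed=0):
    """Hypothetical toy data: n points drawn from a standard normal in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    k = 5
    for dim in (2, 100):
        counts = k_occurrence(make_gaussian_data(200, dim), k)
        # A large maximum suggests hubs; many zeros suggest anti-hubs.
        print(f"dim={dim}: max k-occurrence={max(counts)}, "
              f"points never chosen={counts.count(0)}")
```

Comparing the two printed lines gives a rough feel for how the distribution of neighbor counts changes with dimensionality on this toy data. The same counts could be recomputed after any re-scaling of the distances to check whether it evens out the distribution.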
Similar resources
Relationship-based Visualization of High-dimensional Data Clusters
In several real-life data mining applications, data resides in very high (> 1000) dimensional space, where both clustering techniques developed for low dimensional spaces (k-means, BIRCH, CLARANS, CURE, DBScan etc) as well as visualization methods such as parallel coordinates or projective visualizations, are rendered ineffective. This paper proposes a relationship based approach to clustering ...
Probabilistic Combination of Features for Music Classification
We describe an approach to the combination of music similarity feature spaces in the context of music classification. The approach is based on taking the product of posterior probabilities obtained from separate classifiers for the different feature spaces. This allows for a different influence of the classifiers per song and an overall classification accuracy improving those resulting from ind...
Relationship-Based Clustering and Visualization for High-Dimensional Data Mining
In several real-life data-mining applications, data reside in very high (1000 or more) dimensional space, where both clustering techniques developed for low-dimensional spaces (k-means, BIRCH, CLARANS, CURE, DBScan, etc.) as well as visualization methods such as parallel coordinates or projective visualizations, are rendered ineffective. This paper proposes a relationship-based approach that al...
The World of Music: SDP layout of high dimensional data
In this paper we investigate the use of Semidefinite Programming (SDP) optimization for high dimensional data layout and graph visualization. We developed a set of interactive visualization tools and used them on music artist ratings data from Yahoo!. The computed layout preserves a natural grouping of the artists and provides visual assistance for browsing large music collections. CR Categorie...
Exploring the relationship between feature and perceptual visual spaces
visual information (images or videos) is increasing and thereby demanding appropriate ways to represent and search these information spaces. Their visualization often relies on reducing the dimensions of the information space to create a lower-dimensional feature space which, from the point-of-view of the end user, will be viewed and interpreted as a perceptual space. Critically for information...